Finite-state models for lexical reordering in spoken language translation

Authors

  • Srinivas Bangalore
  • Giuseppe Riccardi
Abstract

The problem of machine translation can be viewed as consisting of two phases: (a) a lexical choice phase, where appropriate target language lexical items (words or phrases) are chosen for each source language lexical item, and (b) a reordering phase, where the chosen target language lexical items are reordered to produce a meaningful target language string. In earlier work we have shown that finite-state models for lexical choice can be learned from bilingual corpora [6]. In this paper, we focus on stochastic finite-state models for lexical reordering and describe an algorithm to learn them from bilingual corpora. We have developed a stochastic finite-state English-Japanese translation system by composing finite-state lexical choice and lexical reordering models. We have evaluated it using the string edit distance of the translated string from a given reference string. Using this metric, the English-Japanese translation system scored 70.9% on English speech transcriptions.
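
The edit-distance evaluation mentioned in the abstract can be illustrated with a short sketch. The abstract does not say how the distance is turned into a percentage, so this example assumes a token-level Levenshtein distance and an accuracy defined as 1 minus the distance divided by the reference length. The function names `edit_distance` and `string_accuracy` and the example tokens are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Token-level Levenshtein distance (insertions, deletions, substitutions)."""
    # prev[j] is the distance between the hypothesis prefix seen so far and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to the empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # drop token h from the hypothesis
                curr[j - 1] + 1,     # add token r to the hypothesis
                prev[j - 1] + cost,  # keep a match or substitute h with r
            ))
        prev = curr
    return prev[-1]


def string_accuracy(hyp: Sequence[str], ref: Sequence[str]) -> float:
    """Assumed normalization: 1 - distance / reference length."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return 1.0 - edit_distance(hyp, ref) / len(ref)


# Hypothetical example: two tokens are swapped relative to the reference.
hypothesis = ["a", "b", "c", "d"]
reference = ["a", "c", "b", "d"]
print(edit_distance(hypothesis, reference))
print(string_accuracy(hypothesis, reference))
```

Under this assumed normalization a purely reordered output is still penalized, which is why a metric of this kind is sensitive to the quality of the reordering phase. The score can also fall below zero when the hypothesis is much longer than the reference.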


Similar resources

Generalizing Word Lattice Translation

Word lattice decoding has proven useful in spoken language translation; we argue that it provides a compelling model for translation of text genres as well. We show that prior work in translating lattices using finite state techniques can be naturally extended to more expressive synchronous context-free grammar-based models. Additionally, we resolve a significant complication that non-linear wo...

A Finite-State Approach to Machine Translation

The problem of machine translation can be viewed as consisting of two subproblems: (a) Lexical Selection and (b) Lexical Reordering. We propose stochastic finite-state models for these two subproblems in this paper. Stochastic finite-state models are efficiently learnable from data, effective for decoding and are associated with a calculus for composing models which allows for tight integration of co...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

The RWTH Aachen German to English MT System for IWSLT 2015

This work describes the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2015. We participated in the MT and SLT tracks for the German→English language pair. We employ our state-of-the-art phrase-based and hierarchical phrase-based baseline systems for the MT track. ...

A Reordering Approach for Statistical Machine Translation

This paper presents a Markov-based hierarchical reordering scheme for lexical reordering, to be incorporated into a phrase-based statistical machine translation system. The goal is to reorder the words and phrases of the source language syntactic structure into their corresponding target language syntactic order, making translation easier. Without reordering during language translation, sentences can onl...

Publication date: 2000